Robust Voiced/Unvoiced Classification Using Novel Features and Gaussian Mixture Model
Authors
Abstract
Many speech analysis systems need to decide whether a given frame of a speech waveform should be classified as voiced or unvoiced speech. Several approaches to this decision have been described in the literature. In this paper, we present two novel approaches that combine acoustic features with pattern recognition. The first is based on Mel-frequency cepstral coefficients (MFCCs) with a Gaussian mixture model (GMM) classifier and achieved approximately 90% identification accuracy. The second is based on linear predictive coding (LPC) coefficients and a reduced-dimension LPC residual with a GMM classifier and achieved 92% identification accuracy. The performance of both approaches was compared at various noise levels, and the optimum training condition was determined.
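The GMM-based decision described in the abstract can be illustrated with a minimal sketch. The abstract does not give the number of mixture components, the covariance type, or the decision rule, so everything here is an assumption: two diagonal-covariance components per class, plain EM training, a maximum-likelihood choice between a "voiced" and an "unvoiced" model, and synthetic 2-D vectors standing in for real MFCC or LPC features. The names `DiagGMM`, `synthetic_frames`, and `classify` are hypothetical.

```python
import math
import random


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


class DiagGMM:
    """Diagonal-covariance Gaussian mixture fitted with plain EM (illustrative)."""

    def __init__(self, n_components=2, n_iter=30, seed=0, var_floor=1e-3):
        self.k = n_components
        self.n_iter = n_iter
        self.seed = seed
        self.var_floor = var_floor

    def _component_logs(self, x):
        # log(weight) + log N(x | mean, var) for every component.
        return [
            math.log(w) + log_gauss_diag(x, m, v)
            for w, m, v in zip(self.weights, self.means, self.vars)
        ]

    def fit(self, data):
        rng = random.Random(self.seed)
        n, dim = len(data), len(data[0])
        # Initialise means from random frames, variances from the global spread.
        self.means = [list(p) for p in rng.sample(data, self.k)]
        g_mean = [sum(x[d] for x in data) / n for d in range(dim)]
        g_var = [
            max(sum((x[d] - g_mean[d]) ** 2 for x in data) / n, self.var_floor)
            for d in range(dim)
        ]
        self.vars = [list(g_var) for _ in range(self.k)]
        self.weights = [1.0 / self.k] * self.k
        for _ in range(self.n_iter):
            # E-step: responsibility of each component for each frame.
            resp = []
            for x in data:
                logs = self._component_logs(x)
                norm = logsumexp(logs)
                resp.append([math.exp(l - norm) for l in logs])
            # M-step: re-estimate weights, means, and floored variances.
            for j in range(self.k):
                nj = sum(r[j] for r in resp) + 1e-10
                self.weights[j] = nj / n
                for d in range(dim):
                    mean = sum(r[j] * x[d] for r, x in zip(resp, data)) / nj
                    var = sum(r[j] * (x[d] - mean) ** 2 for r, x in zip(resp, data)) / nj
                    self.means[j][d] = mean
                    self.vars[j][d] = max(var, self.var_floor)
        return self

    def score(self, x):
        """Log-likelihood of one frame under the mixture."""
        return logsumexp(self._component_logs(x))


def synthetic_frames(centres, n_per_centre, rng, spread=0.5):
    """Synthetic 2-D feature vectors; a stand-in for real MFCC/LPC features."""
    return [
        [rng.gauss(cx, spread), rng.gauss(cy, spread)]
        for cx, cy in centres
        for _ in range(n_per_centre)
    ]


data_rng = random.Random(42)
voiced_train = synthetic_frames([(2.0, 2.0), (3.0, 0.0)], 100, data_rng)
unvoiced_train = synthetic_frames([(-2.0, -2.0), (-3.0, 0.0)], 100, data_rng)

# One GMM per class, each trained only on frames of that class.
voiced_gmm = DiagGMM().fit(voiced_train)
unvoiced_gmm = DiagGMM().fit(unvoiced_train)


def classify(frame):
    """Pick the class whose GMM gives the frame the higher log-likelihood."""
    if voiced_gmm.score(frame) > unvoiced_gmm.score(frame):
        return "voiced"
    return "unvoiced"


print(classify([2.1, 1.8]))
print(classify([-2.9, 0.2]))
```

In a real system the synthetic vectors would be replaced by per-frame features extracted from labelled speech, and the component count would be tuned on held-out data.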
Similar resources
Efficient Implementation of Voiced/Unvoiced Sounds Classification Based on GMM for SMV Codec
In this letter, we propose an efficient method to improve the performance of voiced/unvoiced (V/UV) sounds decision for the selectable mode vocoder (SMV) of 3GPP2 using the Gaussian mixture model (GMM). We first present an effective analysis of the features and the classification method adopted in the SMV. And feature vectors which are applied to the GMM are then selected from relevant paramete...
A Comprehensive Noise Robust Speech Parameterization Algorithm Using Wavelet Packet Decomposition-Based Denoising and Speech Feature Representation Techniques
This paper concerns the problem of automatic speech recognition in noise-intense and adverse environments. The main goal of the proposed work is the definition, implementation, and evaluation of a novel noise robust speech signal parameterization algorithm. The proposed procedure is based on time-frequency speech signal representation using wavelet packet decomposition. A new modified soft thre...
Robust voiced/unvoiced speech classification using empirical mode decomposition and periodic correlation model
This paper presents a method of voiced/unvoiced (V/Uv) classification of noisy speech signals. Empirical mode decomposition (EMD), a newly developed tool to analyze nonlinear and non-stationary signals is used to filter the additive noise with the speech signal. The normalized autocorrelation of the filtered speech signal is computed to enhance the periodicity if any. It is considered that the ...
HMM-based MAP Prediction of Formant Frequencies from N
This paper describes how formant frequencies of voiced and unvoiced speech can be predicted from mel-frequency cepstral coefficients (MFCC) vectors using maximum a posteriori (MAP) estimation within a hidden Markov model (HMM) framework. Gaussian mixture models (GMMs) are used to model the local joint density of MFCCs and formant frequencies. More localised prediction is achieved by modelling s...
Voicing classification of visual speech using convolutional neural networks
The application of neural network and convolutional neural network (CNN) architectures is explored for the tasks of voicing classification (classifying frames as being either non-speech, unvoiced, or voiced) and voice activity detection (VAD) of visual speech. Experiments are conducted for both speaker dependent and speaker independent scenarios. A Gaussian mixture model (GMM) baseline system i...
Journal title:
Volume, issue
Pages -
Publication date 2003